
Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning

Teng, Michael, van de Panne, Michiel, Wood, Frank

arXiv.org Artificial Intelligence

Distributional reinforcement learning (RL) aims to learn a value-network that predicts the full distribution of the returns for a given state, often modeled via a quantile-based critic. This approach has been successfully integrated into common RL methods for continuous control, giving rise to algorithms such as Distributional Soft Actor-Critic (DSAC). In this paper, we introduce multi-sample target values (MTV) for distributional RL, a principled replacement for the single-sample target value estimation commonly employed in current practice. The improved distributional estimates further lend themselves to UCB-based exploration. These two ideas are combined to yield our distributional RL algorithm, E2DC (Extra Exploration with Distributional Critics). We evaluate our approach on a range of continuous control tasks and demonstrate state-of-the-art model-free performance on difficult tasks such as Humanoid control. We provide further insight into the method via visualization and analysis of the learned distributions and their evolution during training.
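As a rough sketch of the multi-sample idea (not the paper's implementation; the function names and the quantile-averaging scheme here are illustrative assumptions), a target distribution for a quantile critic can be built by averaging sorted quantile vectors over several sampled next actions, and the learned spread can then feed a UCB-style bonus:

```python
import numpy as np

def multi_sample_target(target_quantiles_fn, next_state, policy_sample_fn,
                        reward, gamma=0.99, n_samples=4):
    """Average quantile targets over several sampled next actions.

    target_quantiles_fn(s, a) -> array of K quantile estimates of Z(s, a).
    policy_sample_fn(s)       -> one action sampled from the current policy.
    """
    quantile_sets = [np.sort(target_quantiles_fn(next_state,
                                                 policy_sample_fn(next_state)))
                     for _ in range(n_samples)]
    # Averaging sorted quantile vectors reduces the variance of the target
    # distribution estimate relative to using a single sampled action.
    mean_quantiles = np.mean(quantile_sets, axis=0)
    return reward + gamma * mean_quantiles

def ucb_action_score(quantiles, beta=1.0):
    """Optimistic action score: mean predicted return plus an exploration
    bonus proportional to the spread of the return distribution."""
    return quantiles.mean() + beta * quantiles.std()
```

Any distributional critic exposing per-(state, action) quantile estimates could plug into this scheme; the Bellman target is simply broadcast across the averaged quantiles.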


IsoNN: Isomorphic Neural Network for Graph Representation Learning and Classification

Meng, Lin, Zhang, Jiawei

arXiv.org Machine Learning

Deep learning models have achieved huge success in numerous fields, such as computer vision and natural language processing. However, unlike in such fields, it is hard to apply traditional deep learning models to graph data due to the `node-orderless' property. Normally, we use an adjacency matrix to represent a graph, but this casts an artificial and random node order on the graph, which renders the performance of deep models erratic and non-robust. In order to eliminate this unnecessary node-order constraint, in this paper, we propose a novel model named Isomorphic Neural Network (IsoNN), which learns the graph representation by extracting its isomorphic features via graph matching between the input graph and templates. IsoNN has two main components: a graph isomorphic feature extraction component and a classification component. The graph isomorphic feature extraction component utilizes a set of subgraph templates as kernel variables to learn the possible subgraph patterns existing in the input graph and then computes the isomorphic features. A set of permutation matrices is used in this component to break the node order introduced by the matrix representation. To further reduce the computational cost and identify the optimal subgraph patterns, IsoNN adopts two min-pooling layers to find the optimal matching. The first min-pooling layer aims at finding the best permutation matrix, whereas the second one determines the best templates for the input graph data. Three fully-connected layers are used as the classification component in IsoNN. Extensive experiments are conducted on real-world datasets, and the experimental results demonstrate both the effectiveness and efficiency of IsoNN.
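The two min-pooling steps can be sketched as a brute-force matching over permutation matrices, then over templates. This is a toy illustration under the assumption of small template size k (IsoNN learns templates and restricts k precisely because the permutation space grows as k!):

```python
import itertools
import numpy as np

def isomorphic_feature(subgraph, template):
    """First min-pooling: minimum matching error over all k! permutations.

    subgraph, template: (k, k) adjacency matrices.
    """
    k = template.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(k)):
        P = np.eye(k)[list(perm)]                       # permutation matrix
        err = np.linalg.norm(P @ template @ P.T - subgraph)
        best = min(best, err)
    return best

def best_template_score(subgraph, templates):
    """Second min-pooling: keep the template that matches the subgraph best."""
    return min(isomorphic_feature(subgraph, t) for t in templates)
```

By construction, a subgraph that is an isomorphic copy of a template yields a matching error of zero regardless of the node order used to write down its adjacency matrix.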


Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds

Koizumi, Yuma, Saito, Shoichiro, Yamaguchi, Masataka, Murata, Shin, Harada, Noboru

arXiv.org Machine Learning

Use of an autoencoder (AE) as a normal model is a state-of-the-art technique for unsupervised anomaly detection in sounds (ADS). The AE is trained to minimize the sample mean of the anomaly score of normal sounds in a mini-batch. One problem with this approach is that the anomaly score of rare-normal sounds becomes higher than that of frequent-normal sounds, because the sample mean is strongly affected by frequent-normal samples, which preferentially decreases the anomaly score of frequent-normal samples. To decrease anomaly scores for both frequent- and rare-normal sounds, we propose batch uniformization, a training method for unsupervised ADS that minimizes a weighted average of the anomaly score over the samples in a mini-batch. We use the reciprocal of the probability density of each sample as the weight; intuitively, a large weight is given to rare-normal sounds. Such a weight works to give a constant anomaly score for both frequent- and rare-normal sounds. Since the probability density is unknown, we estimate it by kernel density estimation on each training mini-batch. Verification and objective experiments show that the proposed batch uniformization improves the performance of unsupervised ADS.
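A minimal sketch of the weighting scheme (the bandwidth and normalization choices here are illustrative assumptions, not the paper's exact settings): estimate each sample's density within the mini-batch with a Gaussian KDE, then weight its anomaly score by the reciprocal density.

```python
import numpy as np

def kde_density(batch, bandwidth=1.0):
    """Gaussian kernel density estimate of each sample within its mini-batch."""
    diffs = batch[:, None, :] - batch[None, :, :]        # (B, B, D)
    sq = (diffs ** 2).sum(-1)
    kernels = np.exp(-sq / (2 * bandwidth ** 2))
    return kernels.mean(axis=1)                          # (B,)

def uniformized_loss(anomaly_scores, batch, bandwidth=1.0, eps=1e-8):
    """Weight each sample's anomaly score by the reciprocal of its estimated
    density, so rare-normal samples count as much as frequent ones."""
    w = 1.0 / (kde_density(batch, bandwidth) + eps)
    w = w / w.sum()                                      # normalize weights
    return float((w * anomaly_scores).sum())
```

With uniform weights this reduces to the ordinary sample-mean objective; the reciprocal-density weights instead pull the loss toward under-represented regions of the normal data.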


Multivariate Time Series Imputation with Variational Autoencoders

Fortuin, Vincent, Rätsch, Gunnar, Mandt, Stephan

arXiv.org Machine Learning

Time series are often associated with missing values, for instance due to faulty measurement devices, partially observed states, or costly measurement procedures [15]. These missing values impair the usefulness and interpretability of the data, leading to the problem of data imputation: estimating those missing values from the observed ones [38]. Multivariate time series, consisting of multiple correlated univariate time series or channels, give rise to two distinct ways of imputing missing information: (1) by exploiting temporal correlations within each channel, and (2) by exploiting correlations across channels, for example by using lower-dimensional representations of the data. For instance in a medical setting, if the blood pressure of a patient is unobserved, it can be informative that the heart rate at the current time is higher than normal and that the blood pressure was also elevated an hour ago. An ideal imputation model for multivariate time series should therefore take both of these sources of information into account.
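The two sources of information can be illustrated with deliberately simple baselines (these are toy stand-ins, not the paper's VAE-based model): temporal interpolation within a channel, and a regression on a correlated channel.

```python
import numpy as np

def impute_temporal(series):
    """Source (1): fill NaNs in one channel by linear interpolation over time."""
    t = np.arange(len(series))
    mask = ~np.isnan(series)
    return np.interp(t, t[mask], series[mask])

def impute_cross_channel(target, reference):
    """Source (2): fill NaNs in `target` via a least-squares fit on a
    correlated reference channel (e.g. heart rate predicting blood pressure)."""
    mask = ~np.isnan(target)
    a, b = np.polyfit(reference[mask], target[mask], deg=1)
    out = target.copy()
    out[~mask] = a * reference[~mask] + b
    return out
```

A model in the spirit of the paper combines both: a temporal prior over latent trajectories plus a cross-channel decoder, rather than choosing one of these heuristics.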


Twin Auxiliary Classifiers GAN

Gong, Mingming, Xu, Yanwu, Li, Chunyuan, Zhang, Kun, Batmanghelich, Kayhan

arXiv.org Machine Learning

Conditional generative models have enjoyed remarkable progress over the past few years. One of the popular conditional models is the Auxiliary Classifier GAN (AC-GAN), which generates highly discriminative images by extending the GAN loss function with an auxiliary classifier. However, the diversity of the samples generated by AC-GAN tends to decrease as the number of classes increases, limiting its power on large-scale data. In this paper, we identify the source of the low-diversity issue theoretically and propose a practical solution to the problem. We show that the auxiliary classifier in AC-GAN imposes perfect separability, which is disadvantageous when the supports of the class distributions have significant overlap. To address the issue, we propose the Twin Auxiliary Classifiers Generative Adversarial Net (TAC-GAN), which benefits from a new player that interacts with the other players (the generator and the discriminator) in the GAN. Theoretically, we demonstrate that TAC-GAN can effectively minimize the divergence between the generated and real-data distributions. Extensive experimental results show that TAC-GAN successfully replicates the true data distributions on simulated data and significantly improves the diversity of class-conditional image generation on real datasets.
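A hedged sketch of how the extra player changes the generator objective (the exact loss composition and function names here are assumptions for illustration, not the paper's formulation): alongside the adversarial term and the auxiliary-classifier term, the generator is trained to *fool* a twin classifier, which counteracts the perfect-separability pressure.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    z = logits - logits.max(-1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(-1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def tac_generator_loss(d_fake, c_logits_fake, tc_logits_fake, labels):
    """Generator objective (sketch): fool the discriminator, satisfy the
    auxiliary classifier C, and fool the twin classifier C' by subtracting
    its cross-entropy term."""
    adv = -np.log(d_fake + 1e-8).mean()           # non-saturating GAN loss
    ac = cross_entropy(c_logits_fake, labels)     # auxiliary classifier term
    tac = cross_entropy(tc_logits_fake, labels)   # twin classifier term
    return adv + ac - tac
```

When the twin classifier's predictions coincide with the auxiliary classifier's, the two class terms cancel and only the adversarial term remains, which is the intuition for why the bias toward over-separated classes is removed.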


Quantifying Confounding Bias in Neuroimaging Datasets with Causal Inference

Wachinger, Christian, Becker, Benjamin Gutierrez, Rieckmann, Anna, Pölsterl, Sebastian

arXiv.org Machine Learning

Neuroimaging datasets keep growing in size to address increasingly complex medical questions. However, even the largest individual datasets today are too small for training complex machine learning models. A potential solution is to increase sample size by pooling scans from several datasets. In this work, we combine 12,207 MRI scans from 15 studies and show that simple pooling is often ill-advised because it introduces various types of biases into the training data. First, we systematically define these biases. Second, we detect bias by experimentally showing that scans can be correctly assigned to their respective dataset with 73.3% accuracy. Finally, we propose to distinguish causal from confounding factors by quantifying the extent of confounding and causality in a single dataset using causal inference. We achieve this by finding the simplest graphical model in terms of Kolmogorov complexity. As Kolmogorov complexity is not directly computable, we employ the minimum description length to approximate it. We empirically show that our approach is able to estimate plausible causal relationships from real neuroimaging data.
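A toy illustration of the MDL principle at work (this is a crude two-part-code stand-in for Kolmogorov complexity, not the paper's actual scoring of graphical models): each candidate direction is scored by a model cost plus the coding cost of its residuals, and the shorter total description wins.

```python
import numpy as np

def description_length(cause, effect):
    """Crude two-part MDL score for the model cause -> effect:
    cost of encoding the cause marginally, plus a linear model's parameter
    cost, plus the Gaussian coding cost of its residuals (n/2 * log var)."""
    n = len(cause)
    a, b = np.polyfit(cause, effect, deg=1)
    resid = effect - (a * cause + b)
    model_cost = np.log(n)                    # 2 parameters * 0.5 * log(n)
    return ((n / 2) * np.log(cause.var() + 1e-12)
            + (n / 2) * np.log(resid.var() + 1e-12)
            + model_cost)

def infer_direction(x, y):
    """Prefer the direction yielding the shorter total description length."""
    return "x->y" if description_length(x, y) < description_length(y, x) else "y->x"
```

For linear-Gaussian data the two directions are known to be nearly symmetric, so in practice richer model classes and careful complexity penalties are needed; the sketch only conveys the shape of the comparison.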


ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection

Jha, Saurabh, Banerjee, Subho S., Tsai, Timothy, Hari, Siva K. S., Sullivan, Michael B., Kalbarczyk, Zbigniew T., Keckler, Stephen W., Iyer, Ravishankar K.

arXiv.org Machine Learning

The safety and resilience of fully autonomous vehicles (AVs) are of significant concern, as exemplified by several headline-making accidents. While AV development today involves verification, validation, and testing, end-to-end assessment of AV systems under accidental faults in realistic driving scenarios has been largely unexplored. This paper presents DriveFI, a machine learning-based fault injection engine, which can mine situations and faults that maximally impact AV safety, as demonstrated on two industry-grade AV technology stacks (from NVIDIA and Baidu). For example, DriveFI found 561 safety-critical faults in less than 4 hours. In comparison, random injection experiments executed over several weeks could not find any safety-critical faults.


Universal audio synthesizer control with normalizing flows

Esling, Philippe, Masuda, Naotake, Bardet, Adrien, Despres, Romeo, Chemla--Romeu-Santos, Axel

arXiv.org Machine Learning

The ubiquity of sound synthesizers has reshaped music production and even defined entirely new music genres. However, the increasing complexity and number of parameters in modern synthesizers make them harder to master. Hence, developing methods that allow users to easily create and explore sounds with synthesizers is a crucial need. Here, we introduce a novel formulation of audio synthesizer control. We formalize it as finding an organized latent audio space that represents the capabilities of a synthesizer, while constructing an invertible mapping to the space of its parameters. Using this formulation, we show that we can simultaneously address automatic parameter inference, macro-control learning, and audio-based preset exploration within a single model. To solve this new formulation, we rely on Variational Auto-Encoders (VAE) and Normalizing Flows (NF) to organize and map the respective auditory and parameter spaces. We introduce disentangling flows, which perform the invertible mapping between separate latent spaces while steering the organization of some latent dimensions to match target variation factors by splitting the objective as a partial density evaluation. We evaluate our proposal against a large set of baseline models and show its superiority in both parameter inference and audio reconstruction. We also show that the model disentangles the major factors of audio variation as latent dimensions that can be directly used as macro-parameters. We further show that our model is able to learn semantic controls of a synthesizer by smoothly mapping to its parameters. Finally, we discuss the use of our model in creative applications and its real-time implementation in Ableton Live.
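The key property a normalizing flow provides here is an exactly invertible mapping between spaces. A minimal sketch of one affine coupling layer (a standard flow building block; the particular scale/shift functions are placeholders for learned networks, and this is not the paper's disentangling-flow architecture):

```python
import numpy as np

def coupling_forward(z, scale_fn, shift_fn):
    """One affine coupling layer: the first half of z conditions an affine
    transform of the second half, so the Jacobian is triangular and the
    inverse is available in closed form."""
    z1, z2 = np.split(z, 2)
    x2 = z2 * np.exp(scale_fn(z1)) + shift_fn(z1)
    return np.concatenate([z1, x2])

def coupling_inverse(x, scale_fn, shift_fn):
    """Exact inverse of coupling_forward, using the same conditioners."""
    x1, x2 = np.split(x, 2)
    z2 = (x2 - shift_fn(x1)) * np.exp(-scale_fn(x1))
    return np.concatenate([x1, z2])
```

Stacking such layers (with the halves swapped between layers) yields an expressive yet invertible map, which is what lets a single model translate between an organized latent audio space and synthesizer parameters in both directions.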


EGG: a toolkit for research on Emergence of lanGuage in Games

Kharitonov, Eugene, Chaabouni, Rahma, Bouchacourt, Diane, Baroni, Marco

arXiv.org Artificial Intelligence

There is renewed interest in simulating language emergence among deep neural agents that communicate to jointly solve a task, spurred by the practical aim to develop language-enabled interactive AIs, as well as by theoretical questions about the evolution of human language. However, optimizing deep architectures connected by a discrete communication channel (such as that in which language emerges) is technically challenging. We introduce EGG, a toolkit that greatly simplifies the implementation of emergent-language communication games. EGG's modular design provides a set of building blocks that the user can combine to create new games, easily navigating the optimization and architecture space. We hope that the tool will lower the technical barrier, and encourage researchers from various backgrounds to do original work in this exciting area.


Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction

Zhu, Fengda, Chang, Xiaojun, Zeng, Runhao, Tan, Mingkui

arXiv.org Artificial Intelligence

Deep reinforcement learning has made significant progress in the field of continuous control, such as physical control and autonomous driving. However, it is challenging for a reinforcement learning model to learn a policy for each task sequentially due to catastrophic forgetting: the model forgets knowledge it learned in the past when trained on a new task. We consider this challenge from two perspectives: i) acquiring task-specific skills is difficult since task information and rewards are not highly related; ii) learning knowledge from previous experience is difficult in continuous control domains. In this paper, we introduce an end-to-end framework, the Continual Diversity Adversarial Network (CDAN). We first develop an unsupervised diversity exploration method to learn task-specific skills using an unsupervised objective. Then, we propose an adversarial self-correction mechanism to learn knowledge by exploiting past experience. The two learning procedures are presumably reciprocal. To evaluate the proposed method, we introduce a new continual reinforcement learning environment named Continual Ant Maze (CAM) and a new metric termed Normalized Shorten Distance (NSD). The experimental results confirm the effectiveness of diversity exploration and self-correction. It is worth noting that our final result outperforms the baseline by 18.35% in terms of NSD and by 0.61 in average reward.
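Unsupervised diversity objectives of this flavor are commonly implemented with a skill discriminator; as a hedged sketch (a DIAYN-style intrinsic reward used here for illustration, not necessarily CDAN's exact objective), the agent is rewarded when its skills are distinguishable from states:

```python
import numpy as np

def diversity_reward(disc_logits, skill_id, n_skills):
    """Intrinsic reward for unsupervised skill discovery:
    log q(skill | state) - log p(skill), where q is a learned skill
    discriminator (given here as logits over skills for the current state)
    and p is a uniform prior over skills."""
    z = disc_logits - disc_logits.max()               # stable log-softmax
    logq = z - np.log(np.exp(z).sum())
    return float(logq[skill_id] - np.log(1.0 / n_skills))
```

The reward is zero when the discriminator is at chance and positive when it confidently recognizes the active skill, pushing different skills toward visiting distinguishable states even when the task reward is uninformative.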